Author(s): Zongcheng Li; Yasi Zhang
Reviewer(s): Ying Ge
Date: 2025-05-20

Requirements

Draw a connection diagram like this one.

Source: https://molecular-cancer.biomedcentral.com/articles/10.1186/s12943-021-01322-w, the same article as FigureYa260CNV.

Fig. 5 Transcriptional and post-transcriptional characteristics associated with the WM_Score. a Differences in miRNA-targeted signaling pathways in the TCGA-COAD/READ cohort between the WM_Score-high and -low groups. The red line represents a low expression of miRNA in the high WM_Score group, and the blue line represents a high expression of miRNA in the low WM_Score group. Red dots correspond to miRNA-targeted genes highly expressed in the high WM_Score group, and blue dots correspond to miRNA-targeted genes highly expressed in the low WM_Score group. The circle represents a signaling pathway enriched with targeted genes.

A similar figure:

Source: https://doi.org/10.1038/s42255-019-0045-8, the same article as FigureYa174squareCross, FigureYa199crosslink, and FigureYa256panelLink. This paper is known for its connection-line figures, which are widely imitated; whether they will be surpassed remains to be seen.

Fig. 3 | Overview of the propensity score algorithm and the hypoxia-associated molecular patterns across cancer types. c, Association between mRNA expression levels of hypoxia-associated genes and drug sensitivity across 1,074 cancer cell lines by Spearman’s rank correlation. The dark green dots along the x axis indicate hypoxia-related genes; the orange dots denote drugs that are clustered by different signalling pathways. The size of the orange dot indicates the number of genes correlated with drug sensitivity (|rs| > 0.3, FDR < 0.05); the bar plot shows the number of drugs correlated with the genes. The pink and cyan lines indicate positive and negative correlation, respectively. JNK, Jun N-terminal kinase.

Application scenarios

This figure shows miRNA-target gene (or gene-drug, etc.) relationships. The colors of the connecting lines and of the nodes encode the node type (for example, high vs. low WM_Score in the source article). Genes in the same pathway are drawn inside one circle, which is labeled with the pathway name.

The crosslink R package was extended in order to draw this figure, and more connection-style plotting features will be added to it over time. See https://github.com/zzwch/crosslink for the latest version and features; you can also open an issue on GitHub to reach the author directly.

Environment Setup

Install packages from a mirror in China.

options("repos"= c(CRAN="https://mirrors.tuna.tsinghua.edu.cn/CRAN/"))
options(BioC_mirror="http://mirrors.tuna.tsinghua.edu.cn/bioconductor/")

# Install crosslink (make sure you have the latest version)
# remotes::install_github("zzwch/crosslink", build_vignettes = TRUE)

Load packages

library(magrittr)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ tidyr::extract()   masks magrittr::extract()
## ✖ dplyr::filter()    masks stats::filter()
## ✖ dplyr::lag()       masks stats::lag()
## ✖ purrr::set_names() masks magrittr::set_names()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(crosslink) 
## 
## Attaching package: 'crosslink'
## 
## The following object is masked from 'package:purrr':
## 
##     list_along
# Show error messages in English
Sys.setenv(LANGUAGE = "en") 

# Do not convert character vectors to factors (already the default since R 4.0.0)
options(stringsAsFactors = FALSE) 

Input Files

easy_input_links.csv: each row is a link between a source (miRNA) and a target (target gene). The line color is taken from the source's type, source_type (high WM_Score or low WM_Score).

easy_input_nodes.csv: the pathway (path) that each key belongs to. The keys include both sources and targets.
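
As a rough illustration of the layout described above, the sketch below builds two tiny tables with the same column names as the example data in the appendix (key, type, path, signif for nodes; src, tar, source_type for links) and checks that every link endpoint has a row in the node table. The values and the helper name check_links are invented for illustration and are not part of crosslink.

```r
# Toy tables mimicking the column layout of the example data (values are invented)
nodes_demo <- data.frame(
  key    = c("miR-a", "miR-b", "gene1", "gene2", "gene3"),
  type   = c("src_up", "src_dn", "tar_up", "tar_dn", "tar_up"),
  path   = c("source", "source", "path1", "path1", "path2"),
  signif = c("*", "ns", NA, NA, NA)
)
links_demo <- data.frame(
  src = c("miR-a", "miR-a", "miR-b"),
  tar = c("gene1", "gene3", "gene2")
)

# The link color follows the type of the source node
links_demo$source_type <- nodes_demo$type[match(links_demo$src, nodes_demo$key)]

# Hypothetical helper: return link endpoints that are missing from the node table
check_links <- function(links, nodes) {
  setdiff(c(links$src, links$tar), nodes$key)
}

missing_keys <- check_links(links_demo, nodes_demo)
print(missing_keys)  # an empty character vector means every endpoint is known
```

Running a check like this on your own tables before plotting may save some debugging, since a misspelled key would otherwise only show up as a missing line or node.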

links <- read.csv("easy_input_links.csv")
nodes <- read.csv("easy_input_nodes.csv")

# Get all pathway names (everything except "source")
paths <- unique(nodes[nodes$path != "source", ]$path) 

# Put "source" first in the factor levels
nodes$path <- factor(nodes$path, levels = c("source", paths)) 

# Link colors, keyed by source_type (src_up / src_dn)
src_up_col <- "red"
src_dn_col <- "blue"

# Target node colors, keyed by type (tar_up / tar_dn)
tar_up_col <- "red"
tar_dn_col <- "blue"
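
The effect of setting the factor levels explicitly can be seen in a small standalone sketch. Without explicit levels, R sorts them alphabetically, so "source" would not necessarily come first. The toy vector is invented; that crosslink arranges the crosses in level order is an assumption here, suggested by the comment in the code above.

```r
# Invented path labels, in the order they might appear in a node table
path_demo <- c("path2", "source", "path1", "source", "path2")

# Default: levels are sorted alphabetically
default_levels <- levels(factor(path_demo))
print(default_levels)

# Explicit levels: "source" first, then pathways in order of first appearance
paths_demo <- unique(path_demo[path_demo != "source"])
path_fac <- factor(path_demo, levels = c("source", paths_demo))
print(levels(path_fac))
```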

Plotting

1. Quick preview

IMPORTANT: the nodes and edges tables must not contain columns named 'node', 'cross', 'node.type', 'x', 'y', or 'degree'. crosslink uses these names for its own metadata (they appear in the show_aes() output in step 4).
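
A minimal sketch of how such a clash could be detected before calling crosslink(). The helper find_reserved and the "my_" prefix are hypothetical and not part of the package.

```r
# Column names that the text above says must not appear in nodes or edges
reserved <- c("node", "cross", "node.type", "x", "y", "degree")

# Hypothetical helper: report which reserved names a table uses
find_reserved <- function(df) intersect(names(df), reserved)

bad  <- data.frame(key = c("a", "b"), x = c(1, 2), degree = c(3, 4))
good <- data.frame(key = c("a", "b"), type = c("tar_up", "tar_dn"))

clash_bad  <- find_reserved(bad)
clash_good <- find_reserved(good)
print(clash_bad)
print(clash_good)

# One possible fix: rename the offending columns with a prefix
names(bad)[names(bad) %in% reserved] <- paste0("my_", clash_bad)
print(names(bad))
```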

# Create the network layout object with crosslink()
toy <- crosslink(
  nodes = nodes, 
  edges = links,
  cross.by = "path", 
  xrange = c(0, 10),
  yrange = c(-5, 5),
  spaces = "partition")

# Plot the network layout
cl_plot(toy)

2. Arrange the targets in circles, one per pathway

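# Custom function: wrap each group of target nodes into a small circle
toCircle <- function(x, y, rx = 1, ry = 1, intensity = 2){
  # Rescale values to angles in [0, 2*pi]; a 0 is prepended as a reference
  # point and dropped again afterwards
  mapTo2pi <- function(x) {scales::rescale(c(0, x), to = c(0, 2*pi))[-1]}
  data.frame(x, y) %>%
    # Nodes with the same x form one group (one circle per group)
    mutate(group = paste0("group", x)) %>%
    # Circle center, y: spread the groups over the original y range
    mutate(yy = scales::rescale(-x, to = range(y))) %>%
    # Circle center, x: sinusoidal offset around mean(x), scaled by intensity
    mutate(xx = mean(x) + intensity * sin(yy %>% mapTo2pi)) %>%
    group_by(group) %>%
    # Angle of each node on its circle, from the rank of y within the group
    mutate(tri = rank(y, ties.method = "first") %>% mapTo2pi)  %>%
    ungroup() %$%
    data.frame(
      x = xx + rx*sin(tri),
      y = yy + ry*cos(tri))
}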

# Apply the circular transformation to the pathway crosses of the layout object
toy_circle <- toy %>% tf_fun(
  crosses = paths, 
  along = "xy",
  fun = toCircle,
  rx = 0.2, ry = 0.2)

# Plot the transformed network (without labels)
toy_circle %>% cl_plot(label = NA)
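
The core idea inside toCircle, turning a rank into an angle and the angle into a position on a circle, can be sketched on its own in base R. The function on_circle below is a hypothetical simplification for illustration: it handles a single circle with a given center and leaves out the grouping and the sinusoidal offset.

```r
# Place n points evenly on a circle of radius r around (cx, cy)
on_circle <- function(n, cx = 0, cy = 0, r = 0.2) {
  angle <- seq_len(n) / n * 2 * pi   # rank i -> angle i/n * 2*pi
  data.frame(x = cx + r * sin(angle),
             y = cy + r * cos(angle))
}

circle_pts <- on_circle(8, cx = 3, cy = 1)
print(circle_pts)

# Every point lies at distance r from the center
dist_to_center <- sqrt((circle_pts$x - 3)^2 + (circle_pts$y - 1)^2)
print(dist_to_center)
```

Calling plot(circle_pts, asp = 1) shows the ring. With rx different from ry, as toCircle allows, the ring would become an ellipse.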

3. Post-processing

## Geometric transformations of the circular layout:
## rotate everything, flip the pathway crosses, then move them to y = 8
toy_final <- toy_circle %>% 
  tf_rotate(angle = -90) %>% 
  tf_flip(axis = "x", crosses = paths) %>%
  tf_shift(y = 8, crosses = paths, relative = FALSE) %>%
  set_header()

## Visualize the post-processed layout
toy_final %>% cl_plot(label = NA) %>% cl_void()
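
For readers unfamiliar with these operations, here is a plain-coordinate sketch of rotating, flipping, and shifting points. It uses the textbook definitions (rotation about the origin, counterclockwise for positive angles; mirroring x around the midpoint of its range; a relative shift). The helpers rotate_xy, flip_x, and shift_y are hypothetical, and crosslink's tf_rotate, tf_flip, and tf_shift may differ in center, direction, or in how relative = FALSE is handled.

```r
# Rotate points about the origin by `angle` degrees (counterclockwise if positive)
rotate_xy <- function(df, angle) {
  a <- angle * pi / 180
  data.frame(x = df$x * cos(a) - df$y * sin(a),
             y = df$x * sin(a) + df$y * cos(a))
}

# Mirror x around the midpoint of its range
flip_x <- function(df) data.frame(x = sum(range(df$x)) - df$x, y = df$y)

# Shift all points by a fixed amount along y
shift_y <- function(df, y) data.frame(x = df$x, y = df$y + y)

demo_pts <- data.frame(x = c(0, 1, 2), y = c(0, 0, 1))
rot   <- rotate_xy(demo_pts, -90)   # (x, y) becomes (y, -x)
moved <- shift_y(flip_x(rot), 8)
print(round(rot, 10))
print(round(moved, 10))
```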

4. Fine-tuning

# List the metadata columns available for aesthetic mappings
show_aes(toy_final)
## Available meta.data names are showing below.
## Cross: node, node.type, x, y, cross, key, type, path, signif, degree 
## Link: src, tar, source_type, src.cross, tar.cross, source, target, src.degree, tar.degree, x, y, xend, yend 
## Header: node, node.type, x, y, cross, header
ggplot() +
  # Each layer is relatively independent; reorder the layers to change what is drawn on top
  
  # Black circles for the pathways (bottom layer, partly covered by the target points)
  ggforce::geom_circle(
    mapping = aes(x0 = x0, y0 = y0, r = r),
    data = get_cross(toy_final) %>% filter(cross != "source") %>% 
      group_by(path) %>%
      transmute(
        x0 = mean(x),
        y0 = mean(y),
        r = 0.2  # same radius as rx/ry passed to tf_fun() in step 2
      ) %>% unique(),
    show.legend = FALSE
  ) +
  
  # Connection lines (miRNA-target relationships), colored by source_type
  geom_segment(
    mapping = aes(x, y, xend = xend, yend = yend, color = source_type),
    data = get_link(toy_final),
    alpha = 0.3 
  ) + 
  
  # Target nodes, colored by type
  geom_point(
    mapping = aes(x, y, 
                  # size = size, 
                  color = type),
    data = get_cross(toy_final) %>% filter(cross != "source")
  ) +
  
  # Text: pathway name above each circle
  ggrepel::geom_text_repel(
    mapping = aes(x, y, label = header), nudge_y = 0.3, 
    data = get_header(toy_final) %>% filter(cross != "source"),
    segment.color = NA
  ) +
  
  # Text: miRNA (source node) names, rotated 90 degrees
  geom_text(
    mapping = aes(x, y, label = key), angle = 90, hjust = 1, nudge_y = -0.1,
    data = get_cross(toy_final) %>% filter(cross == "source")
  ) +
  
  # Text: number of targets in each pathway, placed at the circle center
  geom_text(
    mapping = aes(x, y, label = num),
    data = get_cross(toy_final) %>% filter(cross != "source") %>% 
      group_by(path) %>%
      transmute(
        x = mean(x),
        y = mean(y),
        num = n()
      ) %>% unique()
  ) +
 
  # Color settings: one shared scale for links (src_*) and target nodes (tar_*)
  scale_color_manual(values = c(
      src_up = src_up_col, src_dn = src_dn_col, 
      tar_up = tar_up_col, tar_dn = tar_dn_col)) + 
  labs(x = NULL, y = "Target_Pathway") +
  scale_y_continuous(expand = expansion(mult = c(0.25,0.1))) -> p

p
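
The circle layer and the count layer above both reduce the target nodes to one row per pathway with group_by() + transmute() + unique(). As a hedged alternative, summarise() should give the same one-row-per-pathway table more directly. The sketch below uses invented coordinates instead of get_cross(toy_final) so that it runs on its own.

```r
library(dplyr)

# Toy coordinates for target nodes in two pathways (invented values)
targets_demo <- data.frame(
  path = c("path1", "path1", "path1", "path2", "path2"),
  x    = c(1.0, 1.2, 0.8, 3.0, 3.2),
  y    = c(8.0, 8.2, 7.8, 8.0, 8.4)
)

# One row per pathway: circle center (x0, y0) and number of targets (num)
centers <- targets_demo %>%
  group_by(path) %>%
  summarise(x0 = mean(x), y0 = mean(y), num = n(), .groups = "drop")

print(centers)
```

A table like centers could feed both geom_circle() (with a constant r) and the geom_text() layer that prints the counts, so the center is computed only once.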

To also draw points for the source nodes, as in the second example figure, run the following:

# Draw the source nodes
p <- p + geom_point(
  mapping = aes(x, y),
  data = get_cross(toy_final) %>% filter(cross == "source")
  )

5. Add annotation plots

Place the 'signif' label of each source below its name.

# Create the circLink plot with the annotation attached at the bottom
cl_plot2(
  p %>% cl_void(th = theme(
    axis.title = element_text())),
  object = toy_final, 
  annotation = cl_annotation(
    bottom = ggplot() +
      geom_text(
        mapping = aes(seq_along(key), 0, label = signif), 
        data = nodes %>% filter(path == "source")
      ) + theme_void() 
    ,
    bottom.by = "source", bottom.height = 0.05
  )
)

# Save the plot as a PDF file
ggsave("circLink.pdf", width = 10, height = 5)

Appendix: How the Example Data Were Generated

# Generate node names
sources <- paste0("source", 1:20 %>% format)
targets <- paste0("target", 1:500 %>% format)
paths <- paste0("path", 1:15 %>% format)

# Fix the seed so that the randomly drawn type and signif values are reproducible
# (these draws were unseeded in the original, so the shipped CSV may differ)
set.seed(666)

# Create the node data frame
nodes <- data.frame(
  key = c(sources, targets),
  type = c(rep("src_up", length(sources)/2),
           rep("src_dn", length(sources)/2),
           sample(c("tar_up", "tar_dn"), length(targets), replace = TRUE)),
  # The per-pathway target counts below sum to 500 = length(targets)
  path = c(rep("source", length(sources)), 
           rep(paths, times = c(
             40, 50, 30, 30, 50, 50, 20, 30, 30, 40, 20, 30, 30, 20, 30
           ))) %>% factor(
             levels = c("source", paths)
           ),
  signif = c(sample(c("*", "**", "***", "ns"), length(sources), replace = TRUE),
             rep(NA, length(targets)))
)

# Generate the links: random source-target pairs, duplicates removed
link_n <- 500
set.seed(666)
links <- data.frame(
  src = sample(sources, link_n, replace = TRUE),
  tar = sample(targets, link_n, replace = TRUE)) %>% 
  unique() %>%
  mutate(source_type = nodes$type[match(src, nodes$key)])

# Save the example data files
write.csv(links, "easy_input_links.csv", row.names = FALSE, quote = FALSE)
write.csv(nodes, "easy_input_nodes.csv", row.names = FALSE, quote = FALSE)
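
One detail of the name generation above: format() pads numbers to a common width with spaces, so the generated keys look like "source 1" rather than "source1". The sketch below shows this next to a zero-padded alternative built with sprintf(); the alternative is only an illustration and is not what the example data use.

```r
ids <- c(1L, 5L, 10L, 20L)

# As in the appendix code: space-padded to a common width
padded_space <- paste0("source", format(ids))
print(padded_space)

# Alternative (not used above): zero-padded, no embedded spaces
padded_zero <- paste0("source", sprintf("%02d", ids))
print(padded_zero)
```

Either form sorts consistently as text; the zero-padded form simply avoids keys with embedded spaces, which can be easier to read in labels.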

Session Info

# Show session information
sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 20.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Asia/Shanghai
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] crosslink_0.1.0 lubridate_1.9.4 forcats_1.0.0   stringr_1.5.1  
##  [5] dplyr_1.1.4     purrr_1.0.4     readr_2.1.5     tidyr_1.3.1    
##  [9] tibble_3.3.0    ggplot2_3.5.2   tidyverse_2.0.0 magrittr_2.0.3 
## 
## loaded via a namespace (and not attached):
##  [1] yulab.utils_0.2.0  sass_0.4.10        generics_0.1.4     ggplotify_0.1.2   
##  [5] stringi_1.8.7      hms_1.1.3          digest_0.6.37      evaluate_1.0.3    
##  [9] grid_4.4.1         timechange_0.3.0   RColorBrewer_1.1-3 fastmap_1.2.0     
## [13] jsonlite_2.0.0     ggrepel_0.9.6      aplot_0.2.5        scales_1.4.0      
## [17] tweenr_2.0.3       textshaping_1.0.1  jquerylib_0.1.4    cli_3.6.5         
## [21] rlang_1.1.6        polyclip_1.10-7    withr_3.0.2        cachem_1.1.0      
## [25] yaml_2.3.10        tools_4.4.1        tzdb_0.5.0         gridGraphics_0.5-1
## [29] vctrs_0.6.5        R6_2.6.1           lifecycle_1.0.4    ggfun_0.1.8       
## [33] fs_1.6.6           MASS_7.3-61        ragg_1.4.0         pkgconfig_2.0.3   
## [37] pillar_1.10.2      bslib_0.9.0        gtable_0.3.6       glue_1.8.0        
## [41] Rcpp_1.0.14        systemfonts_1.2.3  ggforce_0.4.2      xfun_0.52         
## [45] tidyselect_1.2.1   rstudioapi_0.17.1  knitr_1.50         dichromat_2.0-0.1 
## [49] farver_2.1.2       htmltools_0.5.8.1  patchwork_1.3.0    rmarkdown_2.29    
## [53] labeling_0.4.3     compiler_4.4.1